Speaker Diarization for Multi-microphone Meetings Using Only Between-Channel Differences

Authors

  • José Manuel Pardo
  • Xavier Anguera Miró
  • Chuck Wooters
Abstract

We present a method to extract speaker turn segmentation from multiple distant microphones (MDM) using only delay values found via cross-correlation between the available channels. The method is robust against the number of speakers (which is unknown to the system), the number of channels, and the acoustics of the room. The delays between channels are processed and clustered to obtain a segmentation hypothesis. We have obtained a 31.2% diarization error rate (DER) for NIST's RT05s MDM conference room evaluation set. For an MDM subset of NIST's RT04s development set, we have obtained 36.93% DER and 35.73% DER. Comparing those results with the ones presented by Ellis and Liu [8], who also used between-channel differences for the same data, we have obtained a 43% relative improvement in the error rate.
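As a rough illustration of the first step described in the abstract (finding per-channel delay values by cross-correlation), the sketch below estimates the delay of each channel relative to a reference channel over sliding windows. It is a minimal sketch under stated assumptions, not the authors' implementation: it uses plain cross-correlation computed with NumPy's FFT, and the function names (`estimate_delay`, `delay_features`), the choice of channel 0 as reference, and the window, hop, and maximum-lag parameters are all hypothetical. The subsequent processing and clustering of the delays is not shown.

```python
import numpy as np


def estimate_delay(ref, sig, max_lag):
    """Return the lag in samples (within +/- max_lag) at which `sig` best
    matches `ref`; a positive value means `sig` trails `ref`.
    Assumes max_lag >= 1."""
    n = len(ref) + len(sig) - 1
    nfft = 1 << (n - 1).bit_length()  # zero-pad to avoid circular wrap-around
    # Cross-correlation via FFT: cc[k] = sum_t sig[t + k] * ref[t]
    spec = np.fft.rfft(sig, nfft) * np.conj(np.fft.rfft(ref, nfft))
    cc = np.fft.irfft(spec, nfft)
    # Reorder so index 0 corresponds to lag -max_lag and the last to +max_lag
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


def delay_features(channels, win, hop, max_lag):
    """Build one delay vector per analysis window.

    `channels` is an array of shape (n_channels, n_samples); channel 0 is
    taken as the reference (an assumption made for this sketch). The result
    has shape (n_windows, n_channels - 1)."""
    ref = channels[0]
    feats = []
    for start in range(0, len(ref) - win + 1, hop):
        seg = slice(start, start + win)
        feats.append([estimate_delay(ref[seg], ch[seg], max_lag)
                      for ch in channels[1:]])
    return np.array(feats)
```

Each row of the returned array is a vector of between-channel delays for one window; vectors of this kind are what a clustering stage would then group into speaker turns.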


Similar resources

Speaker diarization for multiple distant microphone meetings: mixing acoustic features and inter-channel time differences

Speaker diarization for recordings made in meetings consists of identifying the number of participants in each meeting and creating a list of speech time intervals for each participant. In recently published work [7] we presented some experiments using only TDOA values (Time Delay Of Arrival for different channels) applied to this task. We demonstrated that information in those values can be us...


Robust speaker diarization for meetings: ICSI RT06s evaluation system

In this paper we present the ICSI speaker diarization system submitted for the NIST Rich Transcription evaluation (RT06s) [1] conducted on the meetings environment. This is a set of yearly evaluations which in the last two years have included speaker diarization of two kinds of distinct meetings: conference room and lecture room. The system presented focuses on being robust to changes in the me...


Speaker segmentation and clustering in meetings

This paper describes the issue of automatic speaker segmentation and clustering for natural, multi-speaker meeting conversations. Two systems were developed and evaluated in the NIST RT-04S Meeting Recognition Evaluation, the Multiple Distant Microphone (MDM) system and the Individual Headset Microphone (IHM) system. The MDM system achieved a speaker diarization performance of 28.17%. This syst...


Multi-stage Speaker Diarization for Conference and Lecture Meetings

The LIMSI RT-07S speaker diarization system for the conference and lecture meetings is presented in this paper. This system builds upon the RT06S diarization system designed for lecture data. The baseline system combines agglomerative clustering based on Bayesian information criterion (BIC) with a second clustering using state-of-the-art speaker identification (SID) techniques. Since the baseli...


Live Speaker Identification in Meetings – “who Is Speaking Now?”

The following paper presents an application that fuses the currently artificially separated tasks of speaker identification and speaker diarization. The presented method allows online identification of who is currently speaking using a single far-field microphone in a meeting scenario. It is able to recognize the current speaker after any two seconds of speech. An evaluation of the robustness o...




Publication date: 2006